What is claimed is:

1. A program for enabling a computer to realize a matrix
processing method of a parallel computer in which a
plurality of processors and a plurality of nodes including
memory are connected through a network, the method
comprising:

distributing and allocating one combination of 
bundles of row blocks of a matrix, cyclically allocated, 
to each node in order to process the combination of the
bundles; 

separating a combination of bundles of blocks into 
a diagonal block, a column block under the diagonal block 
and other blocks;

redundantly allocating the diagonal block to each
node and also allocating one of blocks obtained by
one-dimensionally dividing the column block, to each
of the plurality of nodes while communicating in parallel;

applying LU decomposition to both the diagonal 
block and the allocated block in parallel in each node
while communicating among nodes; and 

updating the other blocks of the matrix, using the 
LU-decomposed block. 
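
By way of illustration only, the separating, decomposing and updating steps recited above can be sketched on a single process as a blocked LU decomposition. This is a minimal sketch under stated assumptions, not the claimed parallel implementation: the names `blocked_lu` and `reconstruct` and the block width `b` are hypothetical, no distribution across nodes or communication is modelled, and pivoting is omitted for brevity (the example matrix is diagonally dominant, so no pivoting is needed).

```python
def blocked_lu(a, b):
    """In-place blocked LU without pivoting (illustrative assumption).

    After the call, the strict lower triangle of `a` holds L (unit diagonal
    implied) and the upper triangle holds U.
    """
    n = len(a)
    for k0 in range(0, n, b):
        k1 = min(k0 + b, n)
        # Diagonal block (rows k0..k1-1) together with the column block
        # under it (rows k1..n-1): LU-decompose columns k0..k1-1.
        for k in range(k0, k1):
            for i in range(k + 1, n):
                a[i][k] /= a[k][k]
                for j in range(k + 1, k1):
                    a[i][j] -= a[i][k] * a[k][j]
        # Other blocks, part 1: row block to the right of the diagonal block,
        # solved against the unit lower triangle of the diagonal block.
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                for j in range(k1, n):
                    a[i][j] -= a[i][k] * a[k][j]
        # Other blocks, part 2: trailing submatrix updated with the
        # LU-decomposed column block and the row block computed above.
        for i in range(k1, n):
            for j in range(k1, n):
                s = 0.0
                for k in range(k0, k1):
                    s += a[i][k] * a[k][j]
                a[i][j] -= s
    return a


def reconstruct(lu):
    """Multiply the packed L and U factors back together (for checking)."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]
                s += l_ik * lu[k][j]
            out[i][j] = s
    return out


if __name__ == "__main__":
    matrix = [
        [10.0, 2.0, 3.0, 1.0],
        [2.0, 12.0, 1.0, 3.0],
        [3.0, 1.0, 11.0, 2.0],
        [1.0, 3.0, 2.0, 9.0],
    ]
    packed = blocked_lu([row[:] for row in matrix], 2)
    print(reconstruct(packed))  # should match `matrix` up to rounding
```

In the sketch, each pass of the outer loop corresponds to one combination of a diagonal block, the column block under it, and the remaining blocks; in the claimed method these pieces would reside on different nodes.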

2. The program according to claim 1, wherein
the LU decomposition is executed in parallel by each
processor of each node in a recursive procedure.
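
For illustration, one common form of a recursive LU procedure splits the columns of a panel in half, decomposes the left half, updates the right half, and then decomposes the right half. The following is a sequential sketch under that assumption; the name `recursive_lu` is hypothetical, pivoting is omitted, and the per-processor parallelism recited in the claim is not modelled.

```python
def recursive_lu(a, c0, c1):
    """Recursively LU-decompose columns c0..c1-1 of `a` in place.

    Rows c0..n-1 are processed. No pivoting is performed (an assumption made
    to keep the sketch short). L is stored below the diagonal with an
    implied unit diagonal; U is stored on and above the diagonal.
    """
    n = len(a)
    if c1 - c0 == 1:
        # Base case: a single column, scale the entries under the pivot.
        for i in range(c0 + 1, n):
            a[i][c0] /= a[c0][c0]
        return
    mid = (c0 + c1) // 2
    # Decompose the left half of the panel.
    recursive_lu(a, c0, mid)
    # Solve for the top rows of the right half against the unit lower
    # triangle produced by the left half.
    for k in range(c0, mid):
        for i in range(k + 1, mid):
            for j in range(mid, c1):
                a[i][j] -= a[i][k] * a[k][j]
    # Update the remaining rows of the right half.
    for i in range(mid, n):
        for j in range(mid, c1):
            for k in range(c0, mid):
                a[i][j] -= a[i][k] * a[k][j]
    # Decompose the updated right half.
    recursive_lu(a, mid, c1)


if __name__ == "__main__":
    small = [[4.0, 2.0], [2.0, 3.0]]
    recursive_lu(small, 0, 2)
    print(small)  # packed factors: L = [[1, 0], [0.5, 1]], U = [[4, 2], [0, 2]]
```

The recursion replaces many narrow column operations with fewer, larger block updates, which is the usual motivation for such a procedure.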

3. The program according to claim 1, wherein
in said update step, while computing a row block,
each node transfers data that belongs to a computed block
and is needed to update other blocks, to other nodes
in parallel with the computation.

4. The program according to claim 1, wherein
said parallel computer is an SMP node
distributed-memory type parallel computer in which each
node is an SMP (symmetric multi-processor).

5. A matrix processing device of a parallel computer
in which a plurality of processors and a plurality of 
nodes including memory are connected through a network, 
comprising: 

a first allocation unit distributing and 
allocating one combination of bundles of row blocks of
a matrix, cyclically allocated, to each node in order 
to process the combination of the bundles; 

a separation unit separating a combination of
bundles of blocks into a diagonal block, a column block
under the diagonal block and other blocks;

a second allocation unit redundantly allocating
the diagonal block to each node and also allocating one
of blocks obtained by one-dimensionally dividing the
column block, to each of the plurality of nodes while
communicating in parallel;

an LU decomposition unit applying LU decomposition 
to both the diagonal block and the allocated block in 
parallel in each node while communicating among nodes; 
and 

an update unit updating the other blocks of the
matrix using the LU-decomposed block.

6. A matrix processing method of a parallel computer
in which a plurality of processors and a plurality of
nodes including memory are connected through a network,
comprising:

distributing and allocating one combination of 
bundles of row blocks of a matrix, cyclically allocated, 
to each node in order to process the combination of bundles 
of blocks;

separating a combination of bundles of blocks into 
a diagonal block, a column block under the diagonal block 
and other blocks; 

redundantly allocating the diagonal block to each
node and also allocating one of blocks obtained by
one-dimensionally dividing the column block, to each
of the plurality of nodes while communicating in parallel;

applying LU decomposition to both the diagonal 
block and the allocated block in parallel in each node 
while communicating among nodes; and

updating the other blocks of the matrix, using the 
LU-decomposed block. 
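
As an illustration of the two allocation steps recited above, the sketch below shows one possible reading: row blocks are dealt to nodes cyclically so that each node holds a bundle of row blocks, and the rows of a column block are divided one-dimensionally into contiguous pieces, one per node. The function names and the choice to give any leftover rows to the lowest-numbered nodes are assumptions made for this example and are not taken from the claims.

```python
def bundles_per_node(num_blocks, num_nodes):
    """Deal row-block indices to nodes cyclically; return one bundle per node."""
    bundles = [[] for _ in range(num_nodes)]
    for block in range(num_blocks):
        bundles[block % num_nodes].append(block)
    return bundles


def split_rows(first_row, last_row, num_nodes):
    """Divide rows [first_row, last_row) into num_nodes contiguous pieces.

    Leftover rows go to the lowest-numbered nodes (an illustrative choice).
    Each piece is returned as a (start, end) pair with `end` exclusive.
    """
    base, extra = divmod(last_row - first_row, num_nodes)
    pieces = []
    start = first_row
    for node in range(num_nodes):
        size = base + (1 if node < extra else 0)
        pieces.append((start, start + size))
        start += size
    return pieces


if __name__ == "__main__":
    # Eight row blocks dealt cyclically to three nodes.
    print(bundles_per_node(8, 3))  # [[0, 3, 6], [1, 4, 7], [2, 5]]
    # A column block covering rows 4..13 divided among three nodes.
    print(split_rows(4, 14, 3))    # [(4, 8), (8, 11), (11, 14)]
```

Under this reading, the cyclic dealing keeps the amount of remaining work per node roughly even as the decomposition advances, while the one-dimensional division lets every node take part in decomposing the column block.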

7. A computer-readable storage medium on which is 
recorded a program for enabling a computer to realize
a matrix processing method of a parallel computer in 
which a plurality of processors and a plurality of nodes 
including memory are connected through a network, the 
method comprising:

distributing and allocating one combination of
bundles of row blocks of a matrix, cyclically allocated,
to each node in order to process the combination of the
bundles;

separating a combination of bundles of blocks into 
a diagonal block, a column block under the diagonal block
and other blocks; 

redundantly allocating the diagonal block to each 
node and also allocating one of blocks obtained by 
one-dimensionally dividing the column block, to each 
of the plurality of nodes while communicating in parallel;

applying LU decomposition to both the diagonal
block and the allocated block in parallel in each node 
while communicating among nodes; and 

updating the other blocks of the matrix using the 
LU-decomposed block. 



